The International Journal Of Science & Technoledge (ISSN 2321 - 919X) www.theijst.com 






Intrusion Detection Using Neutrosophic Classifier 



V. Jaiganesh 

Assistant Professor, Department of Computer Science, Dr. N.G.P. Arts and Science College, Coimbatore, India 

P. Rutravigneshwaran 

Research Scholar, Department of Computer Science, Dr. N.G.P. Arts and Science College, Coimbatore, India 



Abstract: 

Neutrosophic logic has recently been applied to network intrusion detection problems. A novel approach to intrusion 
thresholding is proposed by defining a neutrosophic set in the network domain, and neutrosophy is applied to network 
processing through this neutrosophic domain. Region growing of intruder activity based on neutrosophic logic is implemented 
for system traffic, and an approach to network denoising based on neutrosophic sets can also be used. In this paper, a 
neutrosophic set is applied to the field of classifiers, where an SVM is adopted as the example to validate the feasibility and 
effectiveness of neutrosophic logic. In this new application, a neutrosophic set is integrated into a reformulated SVM, and 
the performance of the resulting classifier, N-SVM, is evaluated on a network intrusion detection system. 
1. Introduction 

An intruder can be defined as somebody attempting to break into an existing computer. Such a person is popularly termed a 
hacker, blackhat or cracker. The number of computers connected to networks and the Internet increases every day. This, 
combined with the increase in networking speed, has made intrusion detection a challenging process. System administrators today 
have to deal with a larger number of systems connected to networks that provide a variety of services. Overall, intrusion 
detection involves defense, detection and, importantly, reaction to intrusion attempts. An intrusion detection system (IDS) should 
try to address each of these issues to a high degree [Balakumar et al., 2014]. An insider is someone who has legitimate access to a 
network or computer and tries to misuse those privileges [Balakumar et al., 2014]. Insider intrusion is usually an attempt to 
elevate privileges, to gain information by probing misconfigured services, or just to create mischief. On average, 80% of 
security breaches are committed by insiders. Insider attacks are extremely difficult to detect because they happen within a 
protected and mostly unsuspicious environment. Currently, many IDSs are rule-based systems whose performance relies heavily 
on rules identified by security experts. Since the amount of network traffic is huge, the process of encoding rules is expensive 
and slow. Moreover, security staff have to modify the rules or deploy new rules manually using a specific rule-driven language 
[Jiong Zhang et al., 2008]. To overcome the limitations of rule-based systems, a number of IDSs employ data mining techniques. 
The aims of this paper are (i) to improve the performance of the classifier in terms of true positives, true negatives, false 
positives, false negatives, sensitivity and specificity, and (ii) to improve the classification accuracy using a fuzzy logic 
based neutrosophic classifier. 

2. Literature Review 

In order to improve detection accuracy and efficiency, a new feature selection method based on rough sets and improved 
genetic algorithms was proposed by Yuteng Guo et al., 2010 for network intrusion detection. Intrusion detection is the act of 
detecting unwanted traffic on a network or a device. Umak et al., 2014 presented the MSPSO-DT intrusion detection system, 
in which Multi-Swarm Particle Swarm Optimization (MSPSO) is used as a feature selection algorithm to maximize the detection 
accuracy of the C4.5 decision tree classifier and minimize its running time. Traditional signature-based intrusion detection 
methods cannot find previously unknown attacks. Juvonen and Sipola, 2013 aimed to combine unsupervised anomaly detection 
with rule extraction techniques to create an online anomaly detection framework. Aizhong Mi and Linpeng Hai, 2010 applied a 
pattern recognition approach based on classifier selection to network intrusion detection and proposed a clustering-based classifier 
selection method. In this method, multiple clusters are selected for a test sample. Then, the average performance of each classifier 
on the selected clusters is calculated and the classifier with the best average performance is chosen to classify the test sample. Om 
and Kundu, 2012 proposed a hybrid intrusion detection system that combines k-means with two classifiers, k-nearest neighbor 
and Naive Bayes, for anomaly detection. It selects features using an entropy based feature selection algorithm, which 
keeps the important attributes and removes the redundant ones. Katkar and Kulkarni, 2013 evaluated the variation in 
performance of the Naive Bayesian classifier for intrusion detection when used in combination with different data pre-processing and 
feature selection methods. Because it offers effective data analysis methods, data mining was introduced into IDS. Nadiammai and 
Hemalatha, 2012 proposed applying data mining algorithms to an intrusion detection database. Natesan and Rajesh, 2012 
introduced a new approach called the cascading classification model, based on AdaBoost and a Bayesian Network Classifier, that can 
improve the detection rate of rare network attack categories. In this approach they trained two classifiers with two different 
training sets. The KDD Cup99 dataset was split into two training sets, one containing only non-rare attack records and the 
other containing records of the rare attack categories. 

3. Proposed Neutrosophic Classifier 

Fuzzy logic extends classical logic by assigning to variables a membership function whose degree ranges between 0 and 1. As a 
generalization of fuzzy logic, neutrosophic logic introduces a new component called “indeterminacy” and carries more 
information than fuzzy logic. One could therefore expect the application of neutrosophic logic to lead to better performance than 
fuzzy logic. Neutrosophic logic is so new that its use in many fields merits exploration. In this paper, for the first time, 
neutrosophic logic is applied to the field of classifiers. A neutrosophic set is a generalization of a classical set and a fuzzy set. 
Generally, a neutrosophic set is denoted as <T, I, F>. An element x(t, i, f) belongs to the set in the following way: it is t true, i 
indeterminate, and f false in the set, where t, i, and f are real numbers taken from the sets T, I, and F, with no restriction on T, I, F, nor 
on their sum m = t + i + f. Figure 1 shows the relationship among the classical set, fuzzy set and neutrosophic set. In a classical set, 
i = 0, and t and f are either 0 or 1. In a fuzzy set, i = 0, $0 \le t, f \le 1$ and t + f = 1. In a neutrosophic set, $0 \le t, i, f \le 1$. 




Figure 1: Relationship among classical set, fuzzy set and neutrosophic set 
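
The three kinds of set described above can be told apart by the constraints each places on the triple (t, i, f). The following is a minimal sketch of that check. The names `Neutrosophic` and `set_kind` are hypothetical and introduced only for illustration, and the small tolerance used for the test t + f = 1 is an assumption to absorb floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Neutrosophic:
    """A triple (t, i, f): degrees of truth, indeterminacy and falsity."""
    t: float
    i: float
    f: float


def set_kind(x: Neutrosophic, tol: float = 1e-9) -> str:
    """Return the most specific set type whose constraints the triple meets."""
    if not all(0.0 <= v <= 1.0 for v in (x.t, x.i, x.f)):
        return "outside [0, 1]"
    # Classical set: no indeterminacy, t and f are each exactly 0 or 1.
    if x.i == 0.0 and x.t in (0.0, 1.0) and x.f in (0.0, 1.0):
        return "classical"
    # Fuzzy set: no indeterminacy, and truth and falsity sum to 1.
    if x.i == 0.0 and abs(x.t + x.f - 1.0) <= tol:
        return "fuzzy"
    # Otherwise only the neutrosophic constraints 0 <= t, i, f <= 1 hold.
    return "neutrosophic"


print(set_kind(Neutrosophic(1.0, 0.0, 0.0)))   # classical
print(set_kind(Neutrosophic(0.7, 0.0, 0.3)))   # fuzzy
print(set_kind(Neutrosophic(0.6, 0.3, 0.2)))   # neutrosophic
```

The last triple has t + i + f = 1.1, which is allowed because a neutrosophic set places no restriction on the sum.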

Neutrosophic logic has been applied to network intrusion detection. A novel approach to intrusion thresholding is proposed by 
defining a neutrosophic set in the network domain, and neutrosophy is applied to network processing through this neutrosophic 
domain. Region growing of intruder activity based on neutrosophic logic is implemented for network traffic. A novel approach to 
network denoising based on neutrosophic sets can also be used. In this paper, for the first time, a neutrosophic set is applied to the 
field of classifiers, where an SVM is adopted as the example to validate the feasibility and effectiveness of neutrosophic logic. 
In this new application of neutrosophic logic, a neutrosophic set is integrated into a reformulated SVM, and the 
performance of the resulting classifier, N-SVM, is evaluated on a network intrusion detection system. 

3.1. Background of SVM 

Consider a training set S containing n labeled points $(x_1, y_1), \ldots, (x_n, y_n)$, where $x_j \in R^N$ and $y_j \in \{-1, 1\}$, 
$j = 1, \ldots, n$. Suppose the positive and negative samples can be separated by some hyperplane. This means there is a linear 
function separating the positive and negative samples, of the form: 

$$d(x) = w \cdot x + b \qquad (1)$$ 

For each training sample $x_j$, $d(x_j) \ge 1$ if $y_j = 1$, and $d(x_j) \le -1$ otherwise. This function is also called the decision 
function. A test sample x can be classified as: 

$$y = \mathrm{sign}(d(x)) \qquad (2)$$ 

For a given training dataset, many possible hyperplanes could be found to separate the two classes correctly. SVM aims to find an 
optimal solution by maximizing the margin around the separating hyperplane. The solution for a case in two-dimensional space 
has an optimal separating line, as shown in Figure 2. 




Figure 2: An optimal separating line for a two-dimensional space case 
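
As an illustration of Eqs. (1) and (2), the sketch below evaluates the decision function for a two-dimensional case and classifies two test points. The hyperplane parameters and the points are hypothetical values chosen for the example, and sending the tie d(x) = 0 to the positive class is an arbitrary choice that the text does not specify.

```python
def decision(w, x, b):
    """Eq. (1): d(x) = w . x + b for a linear hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Eq. (2): y = sign(d(x)); a tie at d(x) = 0 is sent to +1 here."""
    return 1 if decision(w, x, b) >= 0 else -1


# Hypothetical separating line x1 + x2 = 3 in two dimensions.
w, b = [1.0, 1.0], -3.0

print(decision(w, [3.0, 2.0], b), classify(w, [3.0, 2.0], b))   # 2.0 1
print(decision(w, [0.5, 1.0], b), classify(w, [0.5, 1.0], b))   # -1.5 -1
```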
The support vectors are the points on the hyperplanes: 

$$y_j (w \cdot x_j + b) = 1 \qquad (3)$$ 

For any other sample $(x_i, y_i)$ that is not on the support vector hyperplanes: 

$$y_i (w \cdot x_i + b) > 1 \qquad (4)$$ 

Mathematically, the margin M between the two support vector hyperplanes is obtained as: 

$$M = \frac{2}{\lVert w \rVert} \qquad (5)$$ 

where $\lVert w \rVert$ is the norm of w. 
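
As a worked check on the margin, the standard derivation can be written out in a few lines. The symbols $x_{+}$ and $x_{-}$ are introduced here only for illustration: they denote one support vector on each of the two hyperplanes $w \cdot x + b = 1$ and $w \cdot x + b = -1$.

```latex
% Support vectors on the two margin hyperplanes
w \cdot x_{+} + b = 1, \qquad w \cdot x_{-} + b = -1
% Subtracting the second equation from the first
w \cdot (x_{+} - x_{-}) = 2
% The margin is the projection of (x_+ - x_-) onto the unit normal w / ||w||
M = \frac{w}{\lVert w \rVert} \cdot (x_{+} - x_{-}) = \frac{2}{\lVert w \rVert}
```

Because M is inversely proportional to $\lVert w \rVert$, a larger margin corresponds to a smaller norm of w.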

Thus, maximizing the margin M is equivalent to minimizing $\lVert w \rVert$ under the constraint that there is no sample between the 
support vector hyperplanes. This constraint can be described as: 

$$y_j (w \cdot x_j + b) \ge 1 \qquad (6)$$ 

In the case that the original samples cannot be separated by any hyperplane, SVM transforms the original samples into a 
higher dimensional space by using a nonlinear mapping. Here, $\Phi(x)$ denotes the mapping from $R^N$ to a higher dimensional space 
Z. A hyperplane needs to be found in the higher dimensional space with maximum margin as: 

$$w \cdot z + b = 0 \qquad (7)$$ 

such that for each point $(z_j, y_j)$, where $z_j = \Phi(x_j)$: 

$$y_j (w \cdot z_j + b) \ge 1, \quad j = 1, \ldots, n \qquad (8)$$ 

When the dataset is not linearly separable, a soft margin is allowed by introducing n non-negative variables, denoted by 
$\xi_1, \xi_2, \ldots, \xi_n$, such that the constraint for each sample in Eq. (8) is rewritten as: 

$$y_j (w \cdot z_j + b) \ge 1 - \xi_j, \quad j = 1, \ldots, n \qquad (9)$$ 

The optimal hyperplane is the solution to the problem 

$$\text{minimize} \quad \frac{1}{2}\, w \cdot w + C \sum_{j=1}^{n} \xi_j \qquad (10)$$ 

$$\text{subject to} \quad y_j (w \cdot z_j + b) \ge 1 - \xi_j, \quad j = 1, \ldots, n \qquad (11)$$ 

where the first term in Eq. (10) measures the margin between support vectors, and the second term measures the amount of 
misclassification. C is a constant parameter that tunes the balance between the maximum margin and the minimum classification 
error. Then, for a test point x which is mapped to z in the feature space, the classification result y is given as: 

$$y = \mathrm{sign}(w \cdot z + b) \qquad (12)$$ 
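
To make the role of the slack variables and of C concrete, the sketch below evaluates the soft-margin objective for a fixed, hypothetical hyperplane and three hypothetical samples. It does not solve the optimization problem. It only computes, for a given (w, b), the smallest slack that satisfies the soft-margin constraint for each sample and the resulting objective value. The function names are illustrative and the identity mapping z = x is assumed.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def slack(w, b, z, y):
    """Smallest xi >= 0 such that y * (w . z + b) >= 1 - xi."""
    return max(0.0, 1.0 - y * (dot(w, z) + b))


def soft_margin_objective(w, b, samples, C):
    """0.5 * w . w + C * (sum of slacks), evaluated at a fixed (w, b)."""
    return 0.5 * dot(w, w) + C * sum(slack(w, b, z, y) for z, y in samples)


# Hypothetical data: the third sample lies inside the margin.
samples = [([2.0, 2.0], 1), ([0.0, 0.0], -1), ([1.0, 1.5], -1)]
w, b = [1.0, 1.0], -3.0

print([slack(w, b, z, y) for z, y in samples])      # [0.0, 0.0, 0.5]
print(soft_margin_objective(w, b, samples, C=10.0))  # 6.0
```

With C = 10 the single margin violation contributes 5.0 to the objective against 1.0 from the margin term, which shows how a large C shifts the balance toward reducing classification error.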

3.2. Fuzzy SVM 

A membership $s_j$ is assigned to each input sample $(x_j, y_j)$, where $0 < s_j < 1$. Since the membership $s_j$ is the attitude of the 
corresponding point $x_j$ toward one class, and the parameter $\xi_j$ is a measure of error in the SVM, the term $s_j \xi_j$ is a measure 
of error with different weighting. The optimal hyperplane problem is then regarded as the solution to: 

$$\text{minimize} \quad \frac{1}{2}\, w \cdot w + C \sum_{j=1}^{n} s_j \xi_j \qquad (13)$$ 
$$\text{subject to} \quad y_j (w \cdot z_j + b) \ge 1 - \xi_j, \quad j = 1, \ldots, n \qquad (14)$$ 

In order to use FSVM, a membership function needs to be defined for each input sample; the definition used here is motivated 
as follows. From Eq. (10) one can see that if the $\xi_i$ of a misclassified sample $x_i$ is increased, the newly learned hyperplane will 
tend to classify $x_i$ correctly in order to eliminate the larger error that $x_i$ introduces to the classifier, and thereby minimize Eq. 
(10). Correspondingly, in Eq. (13), assigning a larger membership $s_i$ to an input increases the probability of correctly 
classifying that sample, while a smaller membership decreases the probability of correctly classifying it. Based on this 
observation, the membership function is defined as follows. 




Figure 3: Different regions in the high-dimensional space 



1. First, a traditional SVM is trained using the original training set. 

2. After step 1, the hyperplane $w \cdot z + b = 0$ is found. If $w \cdot z + b > 0$, the data point is assigned to the positive 
class; otherwise, it is assigned to the negative class. There are also two other hyperplanes, $w \cdot z + b = 1$ and 
$w \cdot z + b = -1$. As indicated in Figure 3, the high-dimensional space is divided into four regions by these three 
hyperplanes. For the positive samples, region A represents the input points that are correctly classified and whose 
associated $\xi$ values are 0. Region B represents the input points that are also correctly classified but whose associated 
$\xi$ values are non-zero. Regions C and D represent the input points that are incorrectly classified. 

3. The points in region A make no contribution to the optimization since their $\xi$ values are 0. Thus, no matter what 
membership is assigned to them, it will not affect the resultant hyperplane. Here, for simplicity, a constant value 
$s_A = s_1$ is assigned to them, where $0 < s_1 < 1$. 

4. The points in region B are correctly classified, but they have non-zero $\xi$ values. Thus, they contribute to the optimization 
but should be treated as less important than the points in regions C and D, since they are correctly classified. 
The nearer a point is to the hyperplane $w \cdot z + b = 0$, the more important it is in the next training procedure for 
achieving a better classification result. Given $d = w \cdot z + b$, where $z = \Phi(x)$ for input point x, the membership for 
region B is defined as: 

$$s_B = s_1 + (1 - d) \times s_2 \qquad (15)$$ 

where $s_2 > 0$, $0 < s_1 + s_2 < 1$ and $0 < d < 1$ in region B. 

5. The points in region C are incorrectly classified. It can be predicted that in the next training procedure the hyperplane 
will move towards these points, allowing more of them to be classified correctly. The nearer the points are to the 
hyperplane $w \cdot z + b = 0$, the less important they are in the next training procedure. As explained in step 4, however, 
they are more important than the points in region B. Using the same notation as step 4, the fuzzy membership for region C 
is defined as: 

$$s_C = (s_1 + s_2) + |d| \times s_3 \qquad (16)$$ 

where $s_3 > 0$, $0 < s_1 + s_2 + s_3 < 1$ and $-1 < d < 0$ in region C. 

6. The points in region D are incorrectly classified. The further away the points are from the hyperplane $w \cdot z + b = 0$, 
the more probable it is that they are outliers; thus, a smaller membership should be assigned. The membership for region D 
is defined as: 

$$s_D = \frac{s_1 + s_2 + s_3}{|d|^k} \qquad (17)$$ 

where $k > 0$ and $d < -1$ in region D. Here, k is a positive integer, and the larger k is, the faster the membership decreases 
as the distance $|d|$ increases. The value of k is chosen as 9 in the experiment. 

With the memberships defined in steps 3-6, an FSVM is trained and the obtained FSVM is used as a classification tool. These 
are the steps used to design the proposed membership function. 
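
The region-based memberships of steps 3-6 can be sketched as a single piecewise function of d. This is an illustrative sketch rather than the authors' implementation. The values of s1, s2 and s3 are hypothetical, chosen only to satisfy s2 > 0, s3 > 0 and 0 < s1 + s2 + s3 < 1, while k = 9 is the value reported in the text. The sketch also assumes d = y(w · z + b), so that the regions described for positive samples apply to negative samples by symmetry.

```python
def fuzzy_membership(d, s1=0.2, s2=0.3, s3=0.4, k=9):
    """Membership of a sample given its signed value d = y * (w . z + b).

    s1, s2, s3 are hypothetical constants; k = 9 follows the text.
    """
    if d >= 1.0:
        # Region A: correctly classified, zero slack, constant membership.
        return s1
    if d >= 0.0:
        # Region B: correctly classified but inside the margin.
        return s1 + (1.0 - d) * s2
    if d >= -1.0:
        # Region C: misclassified, within distance 1 of the hyperplane.
        return (s1 + s2) + abs(d) * s3
    # Region D: misclassified and far away, treated as a likely outlier.
    return (s1 + s2 + s3) / abs(d) ** k


for d in (2.0, 0.5, -0.5, -2.0):
    print(d, fuzzy_membership(d))
```

The text leaves the boundary values d = 1, 0 and -1 unassigned. The piecewise formulas agree at each of those points, so the choice of which region receives a boundary value does not change the result.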

3.3. Integrating Neutrosophic Set with Reformulated SVM 

In order to use the reformulated SVM, a weighting function for input samples should be defined. Following the steps in Section 
4.4, every sample has been associated with a triple $\langle t_j, i_j, f_j \rangle$ as its neutrosophic components. A larger $t_j$ means 
the sample is nearer to the center of the labeled class and is less likely to be an outlier, so $t_j$ should be emphasized in the 
weighting function. A larger $i_j$ means the sample is harder to discriminate between the two classes. This factor should also be 
emphasized in the weighting function in order to classify the indeterminate samples more accurately. A larger $f_j$ means the sample 
is more likely to be an outlier, so it should be given less importance in the training procedure. Based on this analysis, the 
weighting function $g_j$ is defined as: 

$$g_j = t_j + i_j - f_j \qquad (18)$$ 

The proposed classifier, denoted as the neutrosophic support vector machine (N-SVM), reduces the effects of outliers in the training 
samples and improves performance when compared to a standard SVM. 
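
The effect of the weighting function in Eq. (18) can be seen on a few triples. The sketch below uses hypothetical neutrosophic components, and the function name `neutrosophic_weight` is illustrative. The text does not state how a negative weight is handled, so the raw value is returned unchanged.

```python
def neutrosophic_weight(t, i, f):
    """Eq. (18): g = t + i - f for one training sample."""
    return t + i - f


# Hypothetical (t, i, f) triples for three kinds of sample.
examples = {
    "near class center": (0.9, 0.1, 0.1),
    "hard to discriminate": (0.5, 0.8, 0.3),
    "likely outlier": (0.1, 0.2, 0.9),
}

for name, (t, i, f) in examples.items():
    print(name, round(neutrosophic_weight(t, i, f), 3))
```

The sample that is hard to discriminate receives the largest weight and the likely outlier the smallest, matching the intent described above. If a library SVM were used, such weights would typically be supplied as per-sample weights, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`. That is one possible route and not necessarily how the authors' MATLAB implementation works.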

4. About the Dataset 

With the enormous growth in the use of computer networks and the huge increase in the number of applications running on top of them, 
network security is becoming increasingly important. As shown in [Landwehr et al., 1994], all computer systems 
suffer from security vulnerabilities that are both technically difficult and economically costly for manufacturers to resolve. 
Since 1999, KDD’99 [KDD Cup, 1999] has been the most widely used data set for the evaluation of anomaly detection methods. 
This data set was prepared by Stolfo et al., 2000 and is built on the data captured in the DARPA’98 IDS evaluation program 
[Lippmann et al., 2000]. 

5. Performance Metrics 

The following performance metrics are used to evaluate NFSVM, HID (Reda Elbasiony et al., 2013) and 
k-means (Muda et al., 2011). The classification accuracy, sensitivity and specificity are calculated from the following counts. 

• True Positive: A legitimate attack which triggers an IDS to produce an alarm. 

• True Negative: An event in which no attack has taken place and no detection is made. 

• False Positive: An event that signals an IDS to produce an alarm when no attack has taken place. 

• False Negative: An event in which no alarm is raised although an attack has taken place. 
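
Sensitivity, specificity and accuracy follow from these four counts through their standard definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and accuracy = (TP + TN) / (TP + TN + FP + FN). The sketch below computes them for hypothetical counts. The numbers are illustrative only and are not results from this paper.

```python
def sensitivity(tp, fn):
    """Fraction of actual attacks that raise an alarm."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of normal events that correctly raise no alarm."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all events that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion counts, not taken from the experiments.
tp, tn, fp, fn = 90, 80, 20, 10

print(sensitivity(tp, fn))        # 0.9
print(specificity(tn, fp))        # 0.8
print(accuracy(tp, tn, fp, fn))   # 0.85
```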

6. Results and Discussions 

Figure 4, Figure 5, Figure 6 and Figure 7 depict the True Positive, True Negative, False Positive and False Negative classification 
results of K-Means (Muda et al., 2011), HID (Reda Elbasiony et al., 2013) and NFSVM (the proposed work, described in Section 3). 
The figures indicate that the proposed NFSVM provides better results than the existing methods. 




Figure 4: True Positive Analysis 
Figure 5: True Negative Analysis 





Figure 7: False Negative Analysis 




Figure 8 depicts the sensitivity of K-Means (Muda et al., 2011), HID (Reda Elbasiony et al., 2013) and 
NFSVM (the proposed work, described in Section 3). The proposed NFSVM provides a better sensitivity, 98.9%, than the 
other two algorithms. 



Figure 9: Specificity Analysis (specificity in % for K-Means, HID and NFSVM) 



Figure 9 depicts the specificity of K-Means (Muda et al., 2011), HID (Reda Elbasiony et al., 2013) and 
NFSVM (the proposed work, described in Section 3). The proposed NFSVM provides a better specificity, 77.78%, than the 
other two algorithms. 
Figure 10: Accuracy Analysis (accuracy in %) 

Figure 10 depicts the accuracy of K-Means (Muda et al., 2011), HID (Reda Elbasiony et al., 2013) and 
NFSVM (the proposed work, described in Section 3). The proposed NFSVM provides a better accuracy, 97%, than the 
other two algorithms. 

7. Conclusion 

In this research work, a fuzzy logic based neutrosophic classifier is applied to misuse and anomaly detection. To address the 
problems of rule-based systems, the fuzzy logic based neutrosophic classifier is employed to build patterns of intrusions. By 
learning from training data, the proposed algorithm builds the patterns automatically instead of requiring rules to be coded 
manually. The proposed approaches are implemented in MATLAB. The implementations are evaluated on the KDD’99 dataset, and the 
experimental results show that the proposed approaches perform better than the best KDD’99 results. 